Neural Machine Translation into Language Varieties
Both research and commercial machine translation have so far neglected the
importance of properly handling the spelling, lexical, and grammatical divergences
occurring among language varieties. Notable cases are standard national
varieties such as Brazilian and European Portuguese, and Canadian and European
French, which popular online machine translation services are not keeping
distinct. We show that an evident side effect of modeling such varieties as
unique classes is the generation of inconsistent translations. In this work, we
investigate the problem of training neural machine translation from English to
specific pairs of language varieties, assuming both labeled and unlabeled
parallel texts, and low-resource conditions. We report experiments from English
to two pairs of dialects, European-Brazilian Portuguese and European-Canadian
French, and two pairs of standardized varieties, Croatian-Serbian and
Indonesian-Malay. We show significant BLEU score improvements over baseline
systems when translation into similar languages is learned as a multilingual
task with shared representations.
Comment: Published at EMNLP 2018, Third Conference on Machine Translation (WMT 2018).
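The abstract does not spell out how the target varieties are signaled to the model, but a common way to realize a multilingual setup with shared representations is the target-token trick: each English source sentence is prefixed with a tag naming the requested output variety, so a single shared encoder-decoder learns all varieties jointly and stays consistent within one variety. A minimal sketch in Python; the tag names are illustrative assumptions, not the paper's exact labels:

```python
# Sketch of the target-token trick for multilingual NMT into language
# varieties: prefix each English source sentence with a tag naming the
# desired target variety, so one shared model conditions on it.
# Tag strings below are illustrative, not taken from the paper.

VARIETY_TAGS = {
    "pt-BR": "<2pt-BR>",  # Brazilian Portuguese
    "pt-PT": "<2pt-PT>",  # European Portuguese
    "fr-CA": "<2fr-CA>",  # Canadian French
    "fr-FR": "<2fr-FR>",  # European French
}

def tag_source(sentence: str, target_variety: str) -> str:
    """Prepend the target-variety token so the shared model can
    condition its output on the requested variety."""
    return f"{VARIETY_TAGS[target_variety]} {sentence}"

print(tag_source("The colors are bright.", "pt-BR"))
# -> "<2pt-BR> The colors are bright."
```

With this scheme, labeled parallel data keeps its true variety tag, while unlabeled data can be given a generic or predicted tag, which is one way the labeled/unlabeled setting described above could be handled.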
Transfer Learning in Multilingual Neural Machine Translation with Dynamic Vocabulary
We propose a method to transfer knowledge across neural machine translation
(NMT) models by means of a shared dynamic vocabulary. Our approach allows us
to extend an initial model for a given language pair to cover new languages by
adapting its vocabulary as new data becomes available (i.e., by introducing
new vocabulary items that are not included in the initial model). The
parameter transfer mechanism is evaluated in two scenarios: i) to adapt a
trained single-language-pair NMT system to work with a new language pair, and
ii) to continuously add new language pairs, growing into a multilingual NMT
system. In
both scenarios, our goal is to improve translation performance while
minimizing the training convergence time. Preliminary experiments spanning five
languages with different training data sizes (i.e., 5k and 50k parallel
sentences) show a significant performance gain ranging from +3.85 up to +13.63
BLEU in different language directions. Moreover, when compared with training an
NMT model from scratch, our transfer-learning approach allows us to reach
higher performance after at most 4% of the total training steps.
Comment: Published at the International Workshop on Spoken Language Translation (IWSLT), 2018.
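As a rough illustration of the vocabulary-adaptation idea (a sketch under our own assumptions, not the authors' implementation): when a new language pair arrives, embedding rows for vocabulary items the parent model already knows are copied over, while rows for genuinely new items are freshly initialized before training continues on the new data.

```python
# Minimal sketch of parameter transfer with a dynamic vocabulary:
# reuse embedding rows for tokens shared with the parent model,
# randomly initialize rows for new tokens. Not the authors' code.
import numpy as np

def transfer_embeddings(parent_emb: np.ndarray,
                        parent_vocab: dict[str, int],
                        new_vocab: dict[str, int],
                        rng: np.random.Generator) -> np.ndarray:
    dim = parent_emb.shape[1]
    # Start with small random values for every row (covers new tokens).
    child_emb = rng.normal(0.0, 0.01, size=(len(new_vocab), dim))
    for token, new_idx in new_vocab.items():
        old_idx = parent_vocab.get(token)
        if old_idx is not None:
            # Overlapping vocabulary item: transfer the trained weights.
            child_emb[new_idx] = parent_emb[old_idx]
    return child_emb

rng = np.random.default_rng(0)
parent_vocab = {"the": 0, "house": 1, "</s>": 2}
new_vocab = {"the": 0, "casa": 1, "</s>": 2}   # "casa" is a new item
parent_emb = rng.normal(size=(len(parent_vocab), 4))
child_emb = transfer_embeddings(parent_emb, parent_vocab, new_vocab, rng)
```

Starting fine-tuning from these transferred weights, rather than from scratch, is what lets the adapted model converge in a small fraction of the usual training steps.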